Diathesis alternation approximation for verb clustering
Abstract
Although diathesis alternations have been used as features for manual verb classification, and there is recent work on incorporating such features in computational models of human language acquisition, work on large-scale verb classification has yet to examine the potential for using diathesis alternations as input features to the clustering process. This paper proposes a method for approximating diathesis alternation behaviour in corpus data and shows, using a state-of-the-art verb clustering system, that features based on alternation approximation outperform those based on independent subcategorization frames. Our alternation-based approach is particularly adept at leveraging information from less frequent data.
Similar resources
Acquiring Lexical Generalizations from Corpora: A Case Study for Diathesis Alternations
This paper examines the extent to which verb diathesis alternations are empirically attested in corpus data. We automatically acquire alternating verbs from large balanced corpora by using partial-parsing methods and taxonomic information, and discuss how corpus data can be used to quantify linguistic generalizations. We estimate the productivity of an alternation and the typicality of its membe...
Using Semantic Preferences to Identify Verbal Participation in Role Switching Alternations
We propose a method for identifying diathesis alternations where a particular argument type is seen in slots which have different grammatical roles in the alternating forms. The method uses selectional preferences acquired as probability distributions over WordNet. Preferences for the target slots are compared using a measure of distributional similarity. The method is evaluated on the causativ...
Probabilistic Models of Verb-Argument Structure
We evaluate probabilistic models of verb argument structure trained on a corpus of verbs and their syntactic arguments. Models designed to represent patterns of verb alternation behavior are compared with generic clustering models in terms of the perplexity assigned to held-out test data. While the specialized models of alternation do not perform as well, closer examination reveals alternation ...
Clustering Hungarian Verbs on the Basis of Complementation Patterns
Our paper reports an attempt to apply an unsupervised clustering algorithm to a Hungarian treebank in order to obtain semantic verb classes. Starting from the hypothesis that semantic metapredicates underlie verbs’ syntactic realization, we investigate how one can obtain semantically motivated verb classes by automatic means. The 150 most frequent Hungarian verbs were clustered on the basis of ...
GermaNet Synsets as Selectional Preferences in Semantic Verb Clustering
WordNet and its German version GermaNet have been widely used as a source of fine-grained selectional preference information, focusing on but not restricted to verb-object relationships (Resnik, 1997; Ribas, 1995; Li and Abe, 1998; Abney and Light, 1999; Wagner, 2000; McCarthy, 2001; Clark and Weir, 2002). In contrast, this paper presents an approach where argument slots of variable verb-frame c...